Utopia: a Load Sharing Facility for Large, Heterogeneous Distributed Computer Systems
Abstract
Load sharing in large, heterogeneous distributed systems allows users to access vast amounts of computing resources scattered around the system and may provide substantial performance improvements to applications. We discuss the design and implementation issues in Utopia, a load sharing facility specifically built for large and heterogeneous systems. The system has no restriction on the types of tasks that can be remotely executed, involves few application changes and no operating system change, supports a high degree of transparency for remote task execution, and incurs low overhead. The algorithms for managing resource load information and task placement take advantage of the clustering nature of large-scale distributed systems: centralized algorithms are used within host clusters, and directed graph algorithms are used among the clusters to make Utopia scalable to thousands of hosts. Task placements in Utopia exploit the heterogeneous hosts and consider varying resource demands of the tasks. A range of mechanisms for remote execution is available in Utopia that provides varying degrees of transparency and efficiency. A number of applications have been developed for Utopia, ranging from a load sharing command interpreter, to parallel and distributed applications, to a distributed batch facility. For example, an enhanced UNIX command interpreter allows arbitrary commands and user jobs to be executed remotely, and a parallel make facility achieves speedups of [...] or more by processing a collection of tasks in parallel on a number of hosts. Such performance is substantially better than that achieved using kernel-based process migration in experimental operating systems such as Sprite.
For correspondence, contact Songnian Zhou at CSRI, University of Toronto, King's College Road, Toronto, Ontario, Canada. Email: zhou@white.toronto.edu. Pierre Delisle is currently with Sun Microsystems Inc., Boul. Dr. Fredrik Philips, Saint-Laurent, QC, Canada.
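The abstract states that task placement exploits heterogeneous hosts and considers the resource demands of tasks, with centralized algorithms inside each cluster. The following is a minimal illustrative sketch of that general idea, not Utopia's actual algorithm: a single in-cluster placement function filters hosts by a task's memory demand and picks the one with the lowest speed-normalized load. All names (`Host`, `Task`, `place_task`), the fields, and the scoring formula are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_speed: float      # hypothetical relative speed factor (1.0 = baseline host)
    run_queue: float      # current CPU run-queue length reported by the host
    free_memory_mb: int   # memory currently available on the host


@dataclass
class Task:
    name: str
    memory_mb: int        # memory the task is expected to need


def place_task(task, hosts):
    """Return the eligible host with the lowest speed-normalized load, or None."""
    # Resource demand: drop hosts that cannot satisfy the task's memory need.
    eligible = [h for h in hosts if h.free_memory_mb >= task.memory_mb]
    if not eligible:
        return None
    # Heterogeneity: (run_queue + 1) / cpu_speed is an assumed proxy for how
    # long the task would take if it joined that host's run queue.
    return min(eligible, key=lambda h: (h.run_queue + 1) / h.cpu_speed)


hosts = [
    Host("slow-idle", cpu_speed=1.0, run_queue=0.0, free_memory_mb=64),
    Host("fast-busy", cpu_speed=4.0, run_queue=2.0, free_memory_mb=256),
    Host("fast-small", cpu_speed=4.0, run_queue=0.0, free_memory_mb=16),
]

chosen = place_task(Task("compile", memory_mb=32), hosts)
print(chosen.name)
```

In this sketch the busy but fast host is preferred over the idle but slow one, while the idle fast host is excluded because it lacks memory; a homogeneous, load-only policy would have chosen differently.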
Similar resources
LSBATCH: A Distributed Load Sharing Batch System
Batch processing, a primary mode of computing in mainframes and supercomputers, is becoming important for networked systems as the computing environments become more and more distributed. In this paper, we discuss the architectural and design considerations, and some important implementation issues of Lsbatch, a distributed batch system for large-scale, heterogeneous computer systems. Lsbatch ...
Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments
The Hadoop MapReduce framework is an important distributed processing model for large-scale, data-intensive applications. The rack-aware data placement strategy of the current Hadoop distributed file system assumes a homogeneous Hadoop cluster, in which each node has the same computing capacity and is assigned the same workload. Default Hadoop d...
Effective Load Sharing on Heterogeneous Networks of Workstations
We consider networks of workstations which are not only timesharing, but also heterogeneous with a large variation in the computing power and memory capacities of different workstations. Many load sharing schemes mainly target sharing CPU resources, and have been intensively evaluated in homogeneous distributed environments. However, the penalties of data accesses and movement in modern compute...
Performance of Hierarchical Load Sharing in Heterogeneous Distributed Systems
Performance of distributed systems can be improved by load sharing (i.e., distributing load from heavily loaded nodes to lightly loaded ones). Dynamic load sharing policies take system state into account in making job distribution decisions. The state information can be maintained in one of two basic ways: distributed or centralized. Two examples of distributed policies are the sender-initiat...
Comparison of Static and Dynamic Load Balancing in Grid Computing
Grid computing consists of a number of heterogeneous systems sharing resources toward a common goal. Resource management and workload management are the two main functions of a grid. Grid computing creates a large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources. Load balancing is used to improve scalability ...
Journal: Softw., Pract. Exper.
Volume: 23
Publication date: 1993